Outlier Detection Using K-Mean and Hybrid Distance Technique on Multi-Dimensional Data Set
Abstract
Outlier detection is a major issue in data mining. Outliers are contaminants that deviate from the other objects in a data set. Outlier detection makes data more informative and easier to understand. Many types of databases are in use nowadays, and many of them contain anomalous objects; detecting or removing these objects is known as outlier detection. In the proposed work, the data set is first partitioned with the K-Mean clustering method, using the mean of the Euclidean and Manhattan distances as the distance measure, and outliers are then identified with the hybrid technique, which is the same mean of the Euclidean and Manhattan distances. The proposed work detects outliers efficiently on two real benchmark data sets: the Iris data set and the Pima Indian Diabetes data set.
Index Terms: Outliers, Euclidean Distance, Manhattan Distance, Hybrid Technique
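The two-stage idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the random initialization, the fixed iteration count, and especially the outlier rule (flagging a point when its hybrid distance to its centroid exceeds `factor` times the cluster's mean distance) are assumptions, since the abstract does not state how the threshold is chosen.

```python
import math
import random


def hybrid_distance(a, b):
    """Mean of the Euclidean and Manhattan distances between two points."""
    euclidean = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    manhattan = sum(abs(x - y) for x, y in zip(a, b))
    return (euclidean + manhattan) / 2.0


def assign(points, centroids):
    """Label each point with the index of its nearest centroid (hybrid distance)."""
    return [
        min(range(len(centroids)), key=lambda c: hybrid_distance(p, centroids[c]))
        for p in points
    ]


def k_means_hybrid(points, k, iterations=20, seed=0):
    """K-means-style partitioning that uses the hybrid distance for assignment."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed initialization: k random points
    for _ in range(iterations):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assign(points, centroids)


def find_outliers(points, centroids, labels, factor=2.0):
    """Return indices of points far from their centroid.

    Assumed rule (not from the abstract): a point is an outlier when its
    hybrid distance exceeds `factor` times the mean distance in its cluster.
    """
    dists = [hybrid_distance(p, centroids[lab]) for p, lab in zip(points, labels)]
    outliers = []
    for c in range(len(centroids)):
        cluster = [d for d, lab in zip(dists, labels) if lab == c]
        if not cluster:
            continue
        threshold = factor * sum(cluster) / len(cluster)
        outliers.extend(
            i for i, (d, lab) in enumerate(zip(dists, labels))
            if lab == c and d > threshold
        )
    return sorted(outliers)
```

Calling `k_means_hybrid(points, k)` on a list of numeric tuples and passing its result to `find_outliers` gives the indices of the flagged points. On data such as Iris, the features would typically be scaled first, because the Manhattan term grows faster than the Euclidean term as the number of dimensions increases.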
Similar resources
The Hybrid Approach for Handling and Detecting Outliers from Dynamic Data Stream
Outlier detection is currently an area of active research in the data mining community. In this article we propose a hybrid approach to capture outliers in a dynamic data stream. We apply the k-mean algorithm, which partitions the data set into a number of chunks or clusters. Each chunk contains a set of data. Once the clusters are formed, the centroid of each cluster is calculated. The points which are lying near ...
A Review on Detection of Outliers Over High Dimensional Streaming Data Using Cluster Based Hybrid Approach
Outlier detection in data streams has gained broad importance due to the increasing cases of fraud in various applications of data streams, data cleaning, network monitoring, invasive species monitoring, stock market analysis, detecting outlying cases in medical data, etc. Finding outliers in a collection of patterns is a very well-known problem in the data mining field. An outl...
Outlier Detection in Dataset using Hybrid Approach
An outlier is a data point that deviates too much from the rest of the data set. Most real-world data sets contain outliers. Outlier analysis is one of the techniques in data mining whose task is to discover data that behave exceptionally compared with the remaining data set. Outlier detection plays an important role in the data mining field. Outlier detection is useful in many fields like Medical, Netwo...
Outlier Detection for Support Vector Machine using Minimum Covariance Determinant Estimator
The purpose of this paper is to identify the points that affect the performance of one of the important algorithms of data mining, namely the support vector machine. The final classification decision is made based on a small portion of the data called support vectors. So, the existence of atypical observations among these points will result in deviation from the correct decision. Thus...
Detecting Suspicious Card Transactions in unlabeled data of bank Using Outlier Detection Techniqes
With the advancement of technology, the use of ATM and credit cards has increased. Cyber fraud and theft are the kinds of threat that result from using these technologies. It is therefore inevitable to use fraud detection algorithms to prevent fraudulent use of bank cards. Credit card fraud can be thought of as a form of identity theft that consists of an unauthorized access to another person's ...
Publication date: 2013